The variation of Zipf’s law in human language

Author

  • R. Ferrer i Cancho
Abstract

Word frequencies in human language follow the so-called Zipf’s law. More precisely, the word frequency spectrum follows a power function whose typical exponent is β ≈ 2, although significant variations are found. We hypothesize that the full range of variation reflects our ability to balance the goal of communication, i.e. maximizing information transfer, against the cost of communication, which is imposed by the limitations of the human brain. We show that the higher the importance of satisfying the goal of communication, the higher the exponent. Here, assuming that words are used according to their meaning, we explain why variation in β should be limited to a particular domain. On the one hand, we explain a non-trivial lower bound at about β = 1.6 for communication systems that neglect the goal of communication. On the other hand, we find a sudden divergence of β if a certain critical balance is crossed. At the same time, a sharp transition to maximum information transfer and, unfortunately, maximum communication cost is found. Consistent with the upper bound of real exponents, the maximum finite value predicted is about β = 2.4. It is convenient for human language not to cross the transition and to remain in a domain where information transfer is high but at a reasonable cost. Therefore, only a particular range of exponents should be found in human speakers. The exponent β contains information about the balance between cost and communicative efficiency.

PACS. 87.10.+e General theory and mathematical aspects – 89.75.Da Systems obeying scaling laws
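
As a hedged illustration of what "the word frequency spectrum follows a power function" means, the relation can be sketched as below. The symbols P(f), f, r and α are notational assumptions introduced here and do not appear in the abstract. The link between β and the rank exponent α is the standard one for a pure power law and holds only approximately for real texts.

```latex
% P(f): proportion of distinct words that occur exactly f times (frequency spectrum)
P(f) \propto f^{-\beta}, \qquad \beta \approx 2 \ \text{(typical value)}

% Equivalent rank-frequency form: f(r) is the frequency of the r-th most frequent word
f(r) \propto r^{-\alpha}, \qquad \beta = 1 + \frac{1}{\alpha}

% Values quoted in the abstract, converted under this relation:
% \beta = 1.6 \Rightarrow \alpha = 1/0.6 \approx 1.67
% \beta = 2   \Rightarrow \alpha = 1
% \beta = 2.4 \Rightarrow \alpha = 1/1.4 \approx 0.71
```

Under this reading, a larger β corresponds to a flatter rank-frequency curve, in which frequent and rare words differ less in frequency.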

Similar resources

Cities, Institutions, and Growth: The Emergence of Zipf’s Law

Zipf’s Law characterizes city populations as obeying a distributional power law and is supposedly one of the most robust regularities in all of economics. This paper shows, to the contrary, that Zipf’s Law only emerged in Europe between 1500 and 1800. It documents how Zipf’s Law emerged with the development of markets in the centuries preceding the onset of modern economic growth. It shows that...

Menzerath's law for the smallest grammars

The aim of this article is to develop a discussion of Menzerath’s law from the point of view of information theory. More precisely, we shall seek links between the law and the recently abstracted mathematical problem of the smallest grammar (Kieffer & Yang 2000, Charikar et al. 2005). The Altmann-Menzerath law is a general statement about natural language constructions which says: The l...

Zipf’s Law Revisited

Zipf’s law states that the frequency of occurrence of some event as a function of its rank is a power-law function. Using empirical examples from different domains, we demonstrate that at least in some cases, increasingly significant divergences from Zipf’s law are registered as the number of events observed increases. Importantly, one of these cases is word frequency in a corpus of natural lang...

Can simple models explain Zipf's law for all exponents?

H. Simon proposed a simple stochastic process for explaining Zipf’s law for word frequencies. Here we introduce two similar generalizations of Simon’s model that cover the same range of exponents as the standard Simon model. The mathematical approach followed minimizes the amount of mathematical background needed for deriving the exponent, compared to previous approaches to the standard Simon’s...

The use of Zipf’s law in animal communication analysis

Information theory has been discussed as a technique to analyse communicative processes or sequential behaviour of nonhuman animals, as in MacKay (1972), Slater (1973) and Bradbury & Vehrencamp (1998, chapters 13–15) among others. Recently, McCowan et al. (1999) proposed the use of information theory for their study of bottlenose dolphin, Tursiops truncatus, whistles. They discussed several as...

Publication date: 2004